Abstract


The  use  of  hierarchy  is  an  important  component  of  object-oriented  design.  Hierarchy 
allows  the  use  of  type  families,  in  which  higher  level  supertypes  capture  the  behavior  that 
all  of  their  subtypes  have  in  common.  For  this  methodology  to  be  effective,  it  is  necessary 
to  have  a  clear  understanding  of  how  subtypes  and  supertypes  are  related.  This  paper  takes 
the  position  that  the  relationship  should  ensure  that  any  property  proved  about  supertype 
objects  also  holds  for  its  subtype  objects.  It  presents  two  ways  of  defining  the  subtype 
relation,  each  of  which  meets  this  criterion,  and  each  of  which  is  easy  for  programmers 
to  use.  The  subtype  relation  is  based  on  the  specifications  of  the  sub-  and  supertypes; 
the  paper  presents  a  way  of  specifying  types  that  makes  it  convenient  to  define  the  subtype 
relation.  The  paper  also  discusses  the  ramifications  of  this  notion  of  subtyping  on  the  design 
of  type  families. 


1  Introduction 


What  does  it  mean  for  one  type  to  be  a  subtype  of  another?  We  argue  that  this  is  a  semantic 
question  having  to  do  with  the  behavior  of  the  objects  of  the  two  types:  the  objects  of  the 
subtype  ought  to  behave  the  same  as  those  of  the  supertype  as  far  as  anyone  or  any  program 
using  supertype  objects  can  tell. 

For example, in strongly typed object-oriented languages such as Simula 67[9], C++[35],
Modula-3[32], and Trellis/Owl[33], subtypes are used to broaden the assignment statement. An
assignment 

x:  T  :=  E 
is legal provided the type of expression E is a subtype of the declared type T of variable x.
Once  the  assignment  has  occurred,  x  will  be  used  according  to  its  “apparent”  type  T,  with  the 
expectation  that  if  the  program  performs  correctly  when  the  actual  type  of  x’s  object  is  T,  it 
will  also  work  correctly  if  the  actual  type  of  the  object  denoted  by  x  is  a  subtype  of  T. 

Clearly  subtypes  must  provide  the  expected  methods  with  compatible  signatures.  This 
consideration  has  led  to  the  formulation  of  the  contra/covariance  rules[3,  33,  5].  However,  these 
rules  are  not  strong  enough  to  ensure  that  the  program  containing  the  above  assignment  will 
work  correctly  for  any  subtype  of  T,  since  all  they  do  is  ensure  that  no  type  errors  will  occur.  It 
is  well  known  that  type  checking,  while  very  useful,  captures  only  a  small  part  of  what  it  means 
for a program to be correct; the same is true for the contra/covariance rules. For example,
stacks and queues might both have a put method to add an element and a get method to
remove one. According to the contravariance rule, either could be a legal subtype of the other.
However, a program written in the expectation that x is a stack is unlikely to work correctly if
x actually denotes a queue, and vice versa.
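The stack/queue point can be made concrete with a small sketch. The Python below is an illustration only, not part of the paper; the class names Stack and Queue and the helper reverse_with are hypothetical, chosen to show two types whose put/get signatures match while their behavior differs.

```python
from collections import deque


class Stack:
    """LIFO container: get returns the most recently put element."""

    def __init__(self):
        self._items = []

    def put(self, x: int) -> None:
        self._items.append(x)

    def get(self) -> int:
        return self._items.pop()


class Queue:
    """FIFO container with the same method signatures as Stack."""

    def __init__(self):
        self._items = deque()

    def put(self, x: int) -> None:
        self._items.append(x)

    def get(self) -> int:
        return self._items.popleft()


def reverse_with(container, xs):
    """Written in the expectation that `container` is a stack:
    putting all of xs and then getting them back should reverse xs."""
    for x in xs:
        container.put(x)
    return [container.get() for _ in xs]
```

Both containers are acceptable to reverse_with as far as signatures go, yet only the stack produces the reversal that the caller relied on.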

What is needed is a stronger requirement that constrains the behavior of subtypes: properties
that can be proved using the specification of an object’s presumed type should hold
even  though  the  object  is  actually  a  member  of  a  subtype  of  that  type.  This  paper’s  main 
contribution  is  to  provide  two  general,  yet  easy  to  use,  definitions  of  the  subtype  relation  that 
precisely  capture  this  subtype  requirement.  Our  definitions  extend  earlier  work,  including  the 
most  closely  related  work  done  by  America[2],  by  allowing  subtypes  to  have  more  methods  than 
their  supertypes.  They  apply  even  in  a  very  general  environment  in  which  possibly  concurrent 
users share mutable objects. Our approach is also constructive: One can prove whether a subtype
relation holds by proving a small number of simple lemmas based on the specifications of
the  two  types. 

Our  paper  makes  two  other  contributions.  First,  it  provides  a  way  of  specifying  object 
types  that  allows  a  type  to  have  multiple  implementations  and  makes  it  convenient  to  define 
the  subtyping  relation.  Our  specifications  are  formal,  which  means  that  they  have  a  precise 
mathematical  meaning  that  serves  as  a  firm  foundation  for  reasoning.  Our  specifications  can 
also be used informally as described in [27].

Second,  it  explores  the  ramifications  of  the  subtype  relation  and  shows  how  interesting  type 
families  can  be  defined.  For  example,  arrays  are  not  a  subtype  of  sequences  (because  the  user 
of  a  sequence  expects  it  not  to  change  over  time)  and  32-bit  integers  are  not  a  subtype  of  64-bit 
integers (because a user of 64-bit integers would expect certain method calls to succeed that
will  fail  when  applied  to  32-bit  integers).  However,  type  families  can  be  defined  that  group  such 
related  types  together. 

The  paper  is  organized  as  follows.  Section  2  discusses  in  more  detail  what  we  require  of 
our  subtype  relation  and  provides  the  motivation  for  our  approach.  Next  we  describe  our 
model  of  computation  and  then  present  our  specification  method.  Section  5  presents  our  two 
definitions  of  subtyping  and  Section  6  discusses  the  ramifications  of  our  approach  on  designing 
type  hierarchies.  We  describe  related  work  in  Section  7  and  then  close  with  a  summary  of 
contributions. 

2  Motivation 

To  motivate  the  basic  idea  behind  our  notion  of  subtyping,  let’s  look  at  an  example.  Consider 
a  bounded  bag  type  that  provides  a  put  method  that  inserts  elements  into  a  bag  and  a  get 
method  that  removes  an  arbitrary  element  from  a  bag.  Put  has  a  pre-condition  that  checks  to 
see  that  adding  an  element  will  not  grow  the  bag  beyond  its  bound;  get  has  a  pre-condition 
that  checks  to  see  that  the  bag  is  non-empty. 

Consider also a bounded stack type that has, in addition to push and pop methods, a
swap_top method that takes an integer, i, and modifies the stack by replacing its top with i.
Stack’s push and pop methods have pre-conditions similar to bag’s put and get, and swap_top
has a pre-condition requiring that the stack is non-empty.

Intuitively, stack is a subtype of bag because both are collections that retain an element
added by put/push until it is removed by get/pop. The get method for bags does not specify
precisely what element is removed; the pop method for stack is more constrained, but what
it does is one of the permitted behaviors for bag’s get method. Let’s ignore swap_top for the
moment.

Suppose  we  want  to  show  stack  is  a  subtype  of  bag.  We  need  to  relate  the  values  of  stacks  to 
those  of  bags.  This  can  be  done  by  means  of  an  abstraction  function,  like  that  used  for  proving 
the  correctness  of  implementations  [19].  A  given  stack  value  maps  to  a  bag  value  where  we 
abstract  from  the  insertion  order  on  the  elements. 

We also need to relate stack’s methods to bag’s. Clearly there is a correspondence between
stack’s push method and bag’s put and similarly for the pop and get methods (even though
the names of the corresponding methods do not match). The pre- and post-conditions of
corresponding methods will need to relate in some precise (to be defined) way. In showing this
relationship  we  need  to  appeal  to  the  abstraction  function  so  that  we  can  reason  about  stack 
values  in  terms  of  their  corresponding  bag  values. 

Finally, what about swap_top? Most other definitions of the subtype relation have ignored
such “extra” methods, and it is perfectly adequate to do so when procedures are considered in
isolation  and  there  is  no  aliasing.  In  such  a  constrained  situation,  a  program  that  uses  an 
object  that  is  apparently  a  bag  but  is  actually  a  stack  will  never  call  the  extra  methods,  and 
therefore  their  behavior  is  irrelevant.  However,  we  cannot  ignore  extra  methods  in  the  presence 
of  aliasing,  and  also  in  a  general  computational  environment  that  allows  sharing  of  mutable 
objects  by  multiple  users. 

Consider  first  the  case  of  aliasing.  The  problem  here  is  that  within  a  procedure  an  object 
is  accessible  by  more  than  one  name,  so  that  modifications  using  one  of  the  names  are  visible 
when the object is accessed using the other name. For example, suppose σ is a subtype of τ
and that variables

x: τ

y: σ

both denote the same object (which must, of course, belong to σ or one of its subtypes). When
the object is accessed through x, only τ methods can be called. However, when it is used
through y, σ methods can be called and the effects of these methods are visible later when the
object is accessed via x. To reason about the use of variable x using the specification of its type
τ, we need to impose additional constraints on the subtype relation.
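The aliasing scenario can be sketched in a few lines. The Python below is an illustration under stated assumptions: the classes Bag and Stack are simplified stand-ins (no bounds, no pre-conditions), and because Python is dynamically typed the declared types of x and y appear only as comments.

```python
class Bag:
    """Supertype view: only put and get are known."""

    def __init__(self):
        self._items = []

    def put(self, x):
        self._items.append(x)

    def get(self):
        return self._items.pop()


class Stack(Bag):
    """Subtype with an extra mutator that Bag does not have."""

    def swap_top(self, x):
        self._items[-1] = x


y = Stack()          # y: sigma (the subtype)
x = y                # x: tau (used only through Bag's methods); same object as y
x.put(1)
y.swap_top(2)        # extra method called through the alias y
observed = x.get()   # the effect of swap_top is visible through x
```

A user reasoning only from the supertype's methods would not have called swap_top, yet its effect shows up in what x.get() returns; this is why the extra methods cannot be ignored.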

Now  consider  the  case  of  an  environment  of  shared  mutable  objects,  such  as  is  provided 
by  object-oriented  databases  (e.g.,  Thor  [26]  and  Gemstone  [29]).  (In  fact,  it  was  our  interest 
in  Thor  that  motivated  us  to  study  the  meaning  of  the  subtype  relation  in  the  first  place.) 
In  such  systems,  there  is  a  universe  containing  shared,  mutable  objects  and  a  way  of  naming 
those  objects.  In  general,  lifetimes  of  objects  may  be  longer  than  the  programs  that  create 
and  access  them  (i.e.,  objects  might  be  persistent)  and  users  (or  programs)  may  access  objects 
concurrently  and/or  aperiodically  for  varying  lengths  of  time.  Of  course  there  is  a  need  for 
some  form  of  concurrency  control  in  such  an  environment.  We  assume  such  a  mechanism  is  in 
place,  and  consider  a  computation  to  be  made  up  out  of  atomic  units  (i.e.,  transactions)  that 
exclude  one  another.  The  transactions  of  different  computations  can  be  interleaved  and  thus 
one computation is able to observe the modifications made by another.

If there were subtyping in such an environment the following situation might occur. A user
installs a directory object that maps string names to bags. Later, a second user enters a stack
into the directory under some string name; such binding is analogous to assigning a subtype
object to a variable of the supertype. After this, both users occasionally access the stack object.
The  second  user  knows  it  is  a  stack  and  accesses  it  using  stack  methods.  The  question  is:  What 
does  the  first  user  need  to  know  in  order  for  his  or  her  programs  to  make  sense? 

We  think  it  ought  to  be  sufficient  for  a  user  to  only  know  about  the  “apparent  type”  of  the 
object;  the  subtype  ought  to  preserve  any  properties  that  can  be  proved  about  the  supertype. 
We  are  concerned  only  with  safety  properties  (“nothing  bad  happens”).  There  are  two  kinds 
of  safety  properties:  invariant  properties,  which  are  properties  true  of  all  states,  and  history 
properties,  which  are  properties  true  of  all  sequences  of  states.  For  example,  an  invariant 
property of a bag is that its size never exceeds its bound; a history property is that its
bound  does  not  change.  We  might  also  want  to  prove  liveness  properties  (“something  good 
eventually  happens”),  e.g.,  the  size  of  a  bag  will  eventually  reach  the  bound,  but  our  focus  here 
will  be  just  on  safety  properties. 

Thus  the  first  user  ought  to  be  able  to  reason  about  his  or  her  use  of  the  stack  object 
using  invariant  and  history  properties  of  bag.  Both  of  our  definitions  of  subtype  assume  a  type 
specification  includes  an  explicit  invariant  clause  that  states  the  type  invariants  that  must 
be preserved by any of its subtypes. Our two definitions differ in the way they handle extra
methods,  and  thus  in  their  way  of  ensuring  that  history  properties  are  preserved: 

• Our first definition deals with the history properties directly. We add to a type’s specification
a constraint clause that captures exactly those history properties of a type that
must be preserved by any of its subtypes, and we prove that each of the type’s methods
preserves the constraint. Showing that σ is a subtype of τ requires showing that σ’s
constraint implies τ’s (under the abstraction function).

•  Our  second  definition  deals  with  history  properties  indirectly.  For  each  extra  method,  we 
require  that  an  “explanation”  be  given  of  how  its  behavior  could  be  effected  by  just  those 
methods  already  defined  for  the  supertype.  The  explanation  guarantees  that  the  extra 
method  does  not  introduce  any  behavior  that  was  not  already  present,  and  therefore  it 
does  not  interfere  with  any  history  property. 

For  example,  using  the  first  approach  we  would  state  constraints  for  both  bags  and  stacks. 
In  this  particular  example,  the  two  constraints  are  identical;  both  state  that  the  bound  of  the 
bag (or stack) does not change. The extra method swap_top is permitted because it does not
change  the  stack’s  bound.  Showing  that  the  constraint  for  stack  implies  that  of  bag  is  trivial. 
Using the second approach, we would provide an explanation for swap_top in terms of existing
methods:

s.swap_top(i) = s.pop(); s.push(i)

and we would prove that the explanation program really does simulate swap_top’s behavior.
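The explanation for the extra method can be exercised on sample values. The Python below is a sketch only: the class BoundedStack and the helpers explained_swap_top and same_effect are hypothetical names, pre-conditions are approximated with assert statements, and comparing results on a few inputs is a test, not the proof the text calls for.

```python
class BoundedStack:
    def __init__(self, limit):
        self.items = []
        self.limit = limit

    def push(self, i):
        assert len(self.items) < self.limit   # pre-condition: room for one more
        self.items.append(i)

    def pop(self):
        assert self.items                     # pre-condition: non-empty
        return self.items.pop()

    def swap_top(self, i):
        assert self.items                     # pre-condition: non-empty
        self.items[-1] = i


def explained_swap_top(s, i):
    """The explanation program: s.pop(); s.push(i)."""
    s.pop()
    s.push(i)


def same_effect(items, limit, i):
    """Run swap_top and its explanation from the same starting value
    and compare the resulting stack values."""
    a, b = BoundedStack(limit), BoundedStack(limit)
    a.items, b.items = list(items), list(items)
    a.swap_top(i)
    explained_swap_top(b, i)
    return (a.items, a.limit) == (b.items, b.limit)
```

Note that the explanation is well defined even on a full stack: after the pop there is room again, so the pre-condition of push holds.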

In  Section  5  we  present  and  discuss  these  two  alternative  definitions.  First,  however,  we 
define  our  model  of  computation,  and  then  discuss  specifications,  since  these  define  the  objects, 
values,  and  methods  that  will  be  related  by  the  subtype  relation. 

3  Model  of  Computation 

We assume a set of all potentially existing objects, Obj, partitioned into disjoint typed sets.
Each object has a unique identity. A type defines a set of values for an object and a set of
methods that provide the only means to manipulate that object.

Objects can be created and manipulated in the course of program execution. A state defines
a value for each existing object. It is a pair of mappings, an environment and a store. An
environment maps program variables to objects; a store maps objects to values.

State = Env × Store
Env = Var → Obj
Store = Obj → Val

Given a variable, x, and a state, ρ, with an environment, ρ.e, and store, ρ.s, we use the notation
x_ρ to denote the value of x in state ρ; i.e., x_ρ = ρ.s(ρ.e(x)). When we refer to the domain of a
state, dom(ρ), we mean more precisely the domain of the store in that state.
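The environment/store split can be mirrored directly in code. The Python below is a minimal sketch, assuming that variables and object identities are represented by strings and that both mappings are plain dictionaries; the names State, value_of, and dom are illustrative choices, not notation from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    env: dict = field(default_factory=dict)    # Env: Var -> Obj (object identities)
    store: dict = field(default_factory=dict)  # Store: Obj -> Val

    def value_of(self, var):
        """The value of a variable in this state: store applied to env applied to var."""
        return self.store[self.env[var]]

    def dom(self):
        """The domain of a state is the domain of its store."""
        return set(self.store)


# Two variables bound to the same object share one value in the store.
rho = State(env={"x": "obj1", "y": "obj1"}, store={"obj1": (3, 4)})
```

Because x and y map to the same object identity, a change to that object's entry in the store is seen through both variables, which is exactly the aliasing situation discussed in Section 2.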

We model a type as a triple, < O, V, M >, where O ⊆ Obj is a set of objects, V ⊆ Val
is a set of values, and M is a set of methods. Each method for an object is a constructor,
an observer, or a mutator. Constructors of an object of type τ return new objects of type τ;
observers return results of other types; mutators modify the values of objects of type τ. A type
is  mutable  if  any  of  its  methods  is  a  mutator.  We  allow  “mixed  methods”  where  a  constructor 
or  an  observer  can  also  be  a  mutator.  We  also  allow  methods  to  signal  exceptions;  we  assume 
termination  exceptions,  i.e.,  each  method  call  either  terminates  normally  or  in  one  of  a  number 
of  named  exception  conditions.  To  be  consistent  with  object-oriented  language  notation,  we 
write  x.m(a)  to  denote  the  call  of  method  m  on  object  x  with  the  sequence  of  arguments  a. 

Objects  come  into  existence  and  get  their  initial  values  through  creators.  Unlike  other  kinds 
of  methods,  creators  do  not  belong  to  particular  objects,  but  rather  are  independent  operations. 
They are the “class methods”; the other methods are the “instance methods.” (We are ignoring
other  kinds  of  class  methods  in  this  paper.) 

A  computation,  i.e.,  program  execution,  is  a  sequence  of  alternating  states  and  statements 
starting in some initial state, ρ_0:

ρ_0 S_1 ρ_1 ... ρ_{n-1} S_n ρ_n

Each statement, S_i, of a computation sequence is a partial function on states. A history is the
subsequence of states of a computation. A state can change over time in only three ways^1: the
environment  can  change  through  assignment;  the  store  can  change  through  the  invocation  of  a 
mutator;  the  domain  can  change  through  the  invocation  of  a  creator  or  constructor.  We  assume 
the execution of each statement is atomic. Objects are never destroyed:

∀ 1 ≤ i ≤ n . dom(ρ_{i-1}) ⊆ dom(ρ_i).
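The requirement that objects are never destroyed is a simple check over a history. The Python below is a sketch, assuming a history is given as a list of stores (dictionaries from object identities to values) in computation order; the function name domains_only_grow is hypothetical.

```python
def domains_only_grow(history):
    """Check that the domain of each state contains the domain of its
    predecessor, i.e. no object disappears between consecutive states."""
    return all(set(prev) <= set(cur)
               for prev, cur in zip(history, history[1:]))
```

A history in which an object's value changes but the object persists passes the check; one in which an object vanishes does not.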


4  Specifications 

4.1  Type  Specifications 

A type specification includes the following information:

• The type’s name;

• A description of the type’s value space;

• For each of the type’s methods:

  - Its name;

  - Its signature (including signaled exceptions);

  - Its behavior in terms of pre-conditions and post-conditions.

Note  that  the  creators  are  missing.  Creators  are  specified  separately  to  make  it  easy  for  a 
type  to  have  multiple  implementations,  to  allow  subtypes  to  have  different  creators  from  their 
supertypes, and to make it more convenient to define subtypes. We show how to specify creators
in  Section  4.2.  However,  the  absence  of  creators  means  that  data  type  induction  cannot  be  used 
to  reason  about  invariant  properties.  In  Section  4.3  we  discuss  how  we  make  up  for  this  loss  by 
adding  invariants  to  type  specifications. 

In  our  work  we  use  formal  specifications  in  the  two-tiered  style  of  Larch  [16].  The  first  tier 
defines sorts, which are used to define the value spaces of objects. In the second tier, Larch

^1 This model is based on CLU semantics[13].
bag = type

uses BBag (bag for B)
for all b: bag

  put = proc (i: int)
    requires | b_pre.elems | < b_pre.bound
    modifies b
    ensures b_post.elems = b_pre.elems ∪ {i} ∧ b_post.bound = b_pre.bound

  get = proc ( ) returns (int)
    requires b_pre.elems ≠ { }
    modifies b
    ensures b_post.elems = b_pre.elems − {result} ∧ result ∈ b_pre.elems ∧
            b_post.bound = b_pre.bound

  card = proc ( ) returns (int)
    ensures result = | b_pre.elems |

  equal = proc (a: bag) returns (bool)
    ensures result = (a = b)

end bag

Figure 1: A Type Specification for Bags

interfaces  are  used  to  define  types.  For  example,  Figure  1  gives  a  specification  for  a  bag  type 
whose  objects  have  methods  put,  get,  card,  and  equal.  The  uses  clause  defines  the  value  space 
for  the  type  by  identifying  a  sort.  The  clause  in  the  figure  indicates  that  values  of  objects  of 
type  bag  are  denotable  by  terms  of  sort  B  introduced  in  the  BBag  specification;  a  value  of 
this  sort  is  a  pair,  <  elems,  bound  >,  where  elems  is  a  mathematical  multiset  of  integers  and 
bound is a natural number. The notation { } stands for the empty multiset, ∪ is a commutative
operation on multisets that does not discard duplicates, and | x | is a cardinality operation that
returns the total number of elements in the multiset x.

The  body  of  a  type  specification  provides  a  specification  for  each  method.  Since  a  method’s 
specification  needs  to  refer  to  the  method’s  object,  we  introduce  a  name  for  that  object  in 
the  for  all  line.  A  requires  clause  gives  a  method’s  pre-condition;  e.g.,  put's  pre-condition 
checks  to  see  that  adding  an  element  will  not  grow  the  bag  beyond  its  bound.  If  the  clause  is 
missing, the pre-condition is trivially “true.” A modifies x_1, ..., x_n clause is shorthand for the
predicate:

∀ x ∈ (dom(pre) − {x_1, ..., x_n}) . x_pre = x_post

which says only objects listed may change in value. A modifies clause is a strong statement
about all objects not explicitly listed, i.e., their values may not change. If there is no modifies
clause then nothing may change. The post-condition is the conjunction of the modifies and
ensures clauses; e.g., put’s post-condition says that the bag’s value changes by the addition of
its integer argument. For method m, we write m.pre to denote its pre-condition and m.post its
post-condition.
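To see how requires and ensures clauses read operationally, the bag specification can be turned into run-time checks. The Python below is a sketch, not an implementation from the paper: the class name Bag, the use of collections.Counter as the multiset, and the choice of which element get removes are all assumptions, and assert statements stand in for the pre- and post-conditions.

```python
from collections import Counter


class Bag:
    """Value is a pair <elems, bound>; elems is a multiset (Counter)."""

    def __init__(self, bound):
        self.elems = Counter()
        self.bound = bound

    def card(self):
        return sum(self.elems.values())

    def put(self, i):
        # requires: the bag is not yet at its bound
        assert self.card() < self.bound
        pre_elems, pre_bound = self.elems.copy(), self.bound
        self.elems[i] += 1
        # ensures: elems grew by exactly {i}; bound unchanged
        assert self.elems == pre_elems + Counter([i])
        assert self.bound == pre_bound

    def get(self):
        # requires: the bag is non-empty
        assert self.card() > 0
        pre_elems, pre_bound = self.elems.copy(), self.bound
        result = next(iter(self.elems))   # any element is a permitted choice
        self.elems[result] -= 1
        if self.elems[result] == 0:
            del self.elems[result]
        # ensures: result was in the bag, was removed once; bound unchanged
        assert pre_elems[result] > 0
        assert self.elems == pre_elems - Counter([result])
        assert self.bound == pre_bound
        return result
```

The specification of get leaves the removed element unspecified, so any implementation choice that satisfies the ensures clause is acceptable; that freedom is what later allows stack's pop to count as a legal get.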

In the requires and ensures clauses x stands for an object, x_pre for its value in the initial
state, and x_post for its value in the final state.^2 Distinguishing between initial and final values
is necessary only for mutable types, so we suppress the subscripts for parameters of immutable
types (like integers). We need to distinguish between an object, x, and its value, x_pre or
x_post, because we sometimes need to refer to the object itself, e.g., in the equal method, which
determines  whether  two  (mutable)  bags  are  identical.  Result  is  a  way  to  name  a  method’s  result 
parameter. 

Methods  may  terminate  normally  or  exceptionally;  the  exceptions  are  listed  in  a  signals 
clause in the method’s header. For example, an alternative specification for the get method is

get = proc ( ) returns (int) signals (empty)
  modifies b
  ensures if b_pre.elems = { } then signal empty
          else b_post.elems = b_pre.elems − {result} ∧ result ∈ b_pre.elems ∧
               b_post.bound = b_pre.bound
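The difference between the two styles of get can be sketched as follows. The Python is illustrative only: the exception class Empty and the function names get_partial and get_total are hypothetical, and a plain list stands in for the bag's elements.

```python
class Empty(Exception):
    """Stands in for the `empty` signal; the class name is an assumption."""


def get_partial(elems: list) -> int:
    # Pre-condition style: the caller must guarantee a non-empty bag.
    assert elems, "pre-condition violated: bag is empty"
    return elems.pop()


def get_total(elems: list) -> int:
    # Exception style: defined for every bag value, signals on empty.
    if not elems:
        raise Empty()
    return elems.pop()
```

With the pre-condition version the burden of checking falls on the caller; with the signaling version the method's behavior is specified for every input.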

4.2  Specifying  Creators 

Objects  are  created  and  initialized  through  creators.  Figure  2  shows  specifications  for  three 
different  creators  for  bags.  The  first  creator  creates  a  new  empty  bag  whose  bound  is  its  integer 
argument.  The  second  and  third  creators  fix  the  bag’s  bound  to  be  100.  The  third  creator  uses 
its integer argument to create a singleton bag. The assertion new(x) stands for the predicate:

x ∈ dom(post) − dom(pre)

Recall that objects are never destroyed so that dom(pre) ⊆ dom(post).
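Creators, being independent of any particular object, map naturally onto free functions that extend the store. The Python below is a sketch under several assumptions: the store is a dictionary from integer identities to values, a bag value is a pair of a tuple of elements and a bound, fresh identities are taken from the store's size, and the pre-condition on the bound argument is left to the caller. The helper is_new is a hypothetical name for the new(x) predicate.

```python
def create(store: dict, n: int) -> int:
    """New empty bag with bound n (the pre-condition on n is left to the caller)."""
    obj = len(store)          # fresh identity; valid because objects are never destroyed
    store[obj] = ((), n)      # value <elems, bound>, with elems kept as a tuple
    return obj


def create_small(store: dict) -> int:
    """New empty bag with the bound fixed at 100."""
    return create(store, 100)


def create_single(store: dict, i: int) -> int:
    """New singleton bag containing i, with the bound fixed at 100."""
    obj = create(store, 100)
    store[obj] = ((i,), 100)
    return obj


def is_new(obj, pre_dom: set, post_dom: set) -> bool:
    """new(x): x is in the domain of the final state but not of the initial state."""
    return obj in (post_dom - pre_dom)
```

Keeping creators outside the type is what lets different implementations, and different subtypes, offer different ways of making objects without touching the type specification itself.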

^2 Referring to an object’s final value is meaningless in pre-conditions, of course.

create = proc (n: int) returns (bag)
  requires n ≥ 0
  ensures new(result) ∧ result_post = < { }, n >

create_small = proc ( ) returns (bag)
  ensures new(result) ∧ result_post = < { }, 100 >

create_single = proc (i: int) returns (bag)
  ensures new(result) ∧ result_post = < {i}, 100 >

Figure 2: Creator Specifications for Bags

4.3  Type  Specifications  Need  Explicit  Invariants 

By  not  including  creators  in  type  specifications  we  lose  a  powerful  reasoning  tool:  data  type 
induction.  Data  type  induction  is  used  to  prove  type  invariants.  The  base  case  of  the  rule 
requires  that  each  creator  of  the  type  establish  the  invariant;  the  inductive  case  requires  that 
each  method  preserve  the  invariant.  Without  the  creators,  we  have  no  base  case,  and  therefore 
we  cannot  prove  type  invariants! 

To  compensate  for  the  lack  of  data  type  induction,  we  state  the  invariant  explicitly  in  the 
type  specification  by  means  of  an  invariant  clause;  if  the  invariant  is  trivial  (i.e.,  identical  to 
“true”), the clause can be omitted. The invariant defines the legal values of its type τ. For
example, we add

invariant | b_ρ.elems | ≤ b_ρ.bound

to the type specification of Figure 1 to state that the size of a bounded bag never exceeds
its bound. The predicate φ(x_ρ) appearing in an invariant clause for type τ stands for the
predicate:

∀ x: τ, ρ: State . φ(x_ρ)

Any additional invariant properties must follow from the conjunction of the type’s invariant
and  invariants  that  hold  for  the  entire  value  space.  For  example,  we  could  show  that  the  size  of 
a  bag  is  nonnegative  because  this  is  true  for  all  mathematical  multiset  values.  Since  additional 
invariants  cannot  be  proved  using  data  type  induction,  the  specifier  must  be  careful  to  define 
an  invariant  that  is  strong  enough  to  support  all  desired  invariants. 

We must show that the specification preserves the invariant. All creators for a type τ must
establish τ’s invariant, I_τ:

• For each creator for type τ, show I_τ(result_post).

In addition, each method of the type must preserve the invariant. To prove this, we assume
each method is called on an object of type τ with a legal value (one that satisfies the invariant),
and show that any value of a τ object it produces or modifies is legal:

• For each method m of τ, assume I_τ(x_pre) and show I_τ(x_post).

For  example,  we  would  need  to  show  put,  get,  card,  and  equal  each  preserves  the  invariant  for 
bag.  Informally  the  invariant  holds  because  put’s  pre-condition  checks  that  there  is  enough 
room  in  the  bag  for  another  element;  get  either  decreases  the  size  of  the  bag  or  leaves  it  the 
same;  card  and  equal  do  not  change  the  bag  at  all.  The  proof  ensures  that  methods  deal  with 
only  legal  values  of  an  object’s  type. 
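The two proof obligations (creators establish the invariant, methods preserve it) have the shape of an induction, and that shape can be exercised by random testing. The Python below is a sketch only: bag values are modeled as (list, bound) pairs, the functions invariant, put, get, and check_preservation are hypothetical names, and passing such a test is evidence, not the proof the text requires.

```python
import random


def invariant(value):
    """The bag invariant: the number of elements never exceeds the bound."""
    elems, bound = value
    return len(elems) <= bound


def put(value, i):
    elems, bound = value
    assert len(elems) < bound          # pre-condition of put
    return (elems + [i], bound)


def get(value):
    elems, bound = value
    assert elems                       # pre-condition of get
    return (elems[:-1], bound)


def check_preservation(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        bound = rng.randint(0, 5)
        value = ([], bound)            # base case: what a creator produces
        if not invariant(value):
            return False
        for _ in range(20):            # inductive case: apply methods to legal values
            elems, _bound = value
            if len(elems) < bound and rng.random() < 0.6:
                value = put(value, rng.randint(0, 9))
            elif elems:
                value = get(value)
            if not invariant(value):
                return False
    return True
```

Each method is only ever called when its pre-condition holds, which matches the assumption in the rule that methods start from legal values.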

5  The  Meaning  of  Subtype 

5.1  Specifying  Subtypes 

To  state  that  a  type  is  a  subtype  of  some  other  type,  we  simply  append  a  subtype  clause  to 
its  specification.  We  allow  multiple  supertypes;  there  would  be  a  separate  subtype  clause  for 
each.  An  example  is  given  in  Figure  3. 

A subtype’s value space may be different from its supertype’s. For example, in the figure
the sort, S, for bounded stack values is defined in BStack as a pair, < items, limit >, where
items is a sequence of integers and limit is a natural number. The invariant indicates that the
length of the stack’s sequence component is less than or equal to its limit. Under the subtype
clause we define an abstraction function, A, that relates stack values to bag values by relying on
the helping function, mk_elems, that maps sequences to multisets in the obvious manner. (We
will revisit this abstraction function in Section 5.2.2.) The subtype clause also lets specifiers
rename methods of the supertype, e.g., push for put; all other methods of the supertype are
“inherited” without renaming, e.g., equal. In the pre- and post-conditions, [ ] stands for the
empty  sequence,  ||  is  concatenation,  last  picks  off  the  last  element  of  a  sequence,  and  allButLast 
returns  a  new  sequence  with  all  but  the  last  element  of  its  argument. 
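The abstraction function is small enough to write out. The Python below is a sketch, assuming a stack value is a pair of a list and a limit, a bag value is a pair of a collections.Counter and a bound, and the recursive definition of mk_elems follows the two equations of the subtype clause; the representation choices are ours, not the paper's.

```python
from collections import Counter


def mk_elems(items):
    """Map a sequence to a multiset, forgetting insertion order."""
    if not items:
        return Counter()                                     # empty sequence -> empty multiset
    return mk_elems(items[:-1]) + Counter([items[-1]])       # peel off the last element


def A(stack_value):
    """Abstraction function: stack value <items, limit> to bag value <elems, bound>."""
    items, limit = stack_value
    return (mk_elems(items), limit)
```

Two stacks holding the same elements in different orders map to the same bag value, which is the sense in which the abstraction discards insertion order.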

5.2  First  Definition:  Constraint  Rule 

Our first definition of the subtype relation relies on the addition of some information to specifications,
namely a constraint clause that states the history properties of the type explicitly^3;

^3 The use of the term “constraint” is borrowed from the Ina Jo specification language [34], which also includes
constraints in specifications.
stack = type

uses BStack (stack for S)
for all s: stack

  invariant length(s_ρ.items) ≤ s_ρ.limit

  push = proc (i: int)
    requires length(s_pre.items) < s_pre.limit
    modifies s
    ensures s_post.items = s_pre.items || [ i ] ∧ s_post.limit = s_pre.limit

  pop = proc ( ) returns (int)
    requires s_pre.items ≠ [ ]
    modifies s
    ensures result = last(s_pre.items) ∧ s_post.items = allButLast(s_pre.items) ∧
            s_post.limit = s_pre.limit

  swap_top = proc (i: int)
    requires s_pre.items ≠ [ ]
    modifies s
    ensures s_post.items = allButLast(s_pre.items) || [ i ] ∧ s_post.limit = s_pre.limit

  height = proc ( ) returns (int)
    ensures result = length(s_pre.items)

  equal = proc (t: stack) returns (bool)
    ensures result = (s = t)

  subtype of bag (push for put, pop for get, height for card)
    ∀ st: S . A(st) = < mk_elems(st.items), st.limit >
      where mk_elems: Seq → M
      ∀ i: Int, sq: Seq
        mk_elems([ ]) = { }
        mk_elems(sq || [ i ]) = mk_elems(sq) ∪ {i}

end stack

Figure 3: Stack Type


if  the  constraint  is  trivial  (identically  equal  to  “true”),  the  clause  can  be  omitted.  For  example, 
we  add 

constraint b_ρ.bound = b_ψ.bound

to the specification of bag to declare that a bag’s bound never changes. We would add a similar
clause to stack’s specification. As another example, consider a fat_set object that has an insert
but no delete method; fat_sets only grow in size. The constraint for fat_set would be:

constraint ∀ x: int . x ∈ s_ρ ⇒ x ∈ s_ψ

We can formulate history properties as predicates over state pairs. The predicate appearing
in a constraint clause is an abbreviation for a history property. For example, bag's constraint
expands to the following: For any computation, c,

$\forall b : bag,\ \rho, \psi : State \;.\; [\rho < \psi \wedge b \in dom(\rho)] \Rightarrow [b_\rho.bound = b_\psi.bound]$

where $\rho < \psi$ means that state $\rho$ precedes state $\psi$ in c. Note that we implicitly quantify over
all computations, c, and do not require that $\psi$ be the immediate successor of $\rho$.

Just  as  we  had  to  prove  that  methods  preserved  the  invariant,  we  must  show  that  they 
satisfy  the  constraint  by  proving  it  for  each  mutator.  We  do  this  by  using  the  history  rule: 

• History Rule: For each mutator m of $\tau$, show $(m.pre \wedge m.post) \Rightarrow \phi[x_{pre}/x_\rho, x_{post}/x_\psi]$

where $\phi$ is a history property on objects of type $\tau$. P[a/b] stands for predicate P with every
occurrence  of  b  replaced  by  a.  The  constraint  replaces  the  history  rule  as  far  as  users  are 
concerned:  users  can  make  deductions  based  on  the  constraint  but  they  cannot  reason  using 
the  history  rule  directly. 
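As a hedged illustration (not part of the specification language used in this paper), the history
rule can be checked by brute force on a small model of the bounded bag. The tuple representation
of bag values, the Python functions `put` and `get`, and the tiny element domain are assumptions
made for this sketch. It enumerates every pre-state in which a mutator's pre-condition holds and
confirms that the resulting state pair satisfies the constraint on the bound.

```python
from itertools import product

# A bag value is modelled as (elems, bound), where elems is a sorted tuple of
# ints standing for a multiset. This representation is an assumption.

def put(state, i):
    """Post-state of put(i), or None when put's pre-condition fails."""
    elems, bound = state
    if not len(elems) < bound:            # requires |elems| < bound
        return None
    return (tuple(sorted(elems + (i,))), bound)

def get(state, i):
    """Post-state of a get that returns i, or None when i is not in the bag."""
    elems, bound = state
    if i not in elems:
        return None
    rest = list(elems)
    rest.remove(i)
    return (tuple(rest), bound)

def constraint(rho, psi):
    """bag's constraint: the bound is the same in both states."""
    return rho[1] == psi[1]

def history_rule_holds(mutators, states, args):
    """Check (m.pre and m.post) => constraint[x_pre, x_post] for every mutator."""
    for m, pre, a in product(mutators, states, args):
        post = m(pre, a)
        if post is not None and not constraint(pre, post):
            return False
    return True

VALUES = (0, 1)
# Every legal bag value with bound at most 2 over the tiny element domain.
STATES = [(tuple(sorted(e)), b)
          for b in range(3)
          for n in range(b + 1)
          for e in product(VALUES, repeat=n)]

print(history_rule_holds([put, get], STATES, VALUES))  # True
```

A hypothetical mutator that changed the bound would make the same check fail, which is how an
inconsistency between a constraint and a method specification would show up.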

The formal definition of the subtype relation, <, is given in Figure 4. It relates two types, $\sigma$
and $\tau$, each of whose specifications respectively preserves its invariant, $I_\sigma$ and $I_\tau$, and satisfies
its constraint, $C_\sigma$ and $C_\tau$. In the methods and constraint rules, since x is an object of type $\sigma$,
its value ($x_{pre}$ or $x_{post}$) is a member of S and therefore cannot be used directly in the predicates
about $\tau$ objects (which are in terms of values in T). The abstraction function A is used to
translate these values so that the predicates about $\tau$ objects make sense.

5.2.1  Discussion  of  Definition 

The  first  clause  addresses  the  need  to  relate  values.  It  requires  that  abstraction  functions  respect 
the  invariant:  an  abstraction  function  must  map  legal  values  of  the  subtype  to  legal  values  of 
the  supertype.  This  requirement  (and  the  assumption  that  the  type  specification  preserves 
the  invariant)  suffices  to  argue  that  invariant  properties  of  a  supertype  are  preserved  by  the 
subtype. 

The second clause addresses the need to relate non-extra methods of the subtype. Our
formulation is similar to America's [1]. The first two signature rules are the standard
contra/covariance rules. The exception rule says that $m_\sigma$ may not signal more than $m_\tau$, since a
caller of a method on a supertype object should not expect to handle an unknown exception.
The pre- and post-condition rules are the intuitive counterparts to the contravariant and
covariant rules for signatures. The pre-condition rule ensures the subtype's method can be called in
any state required by the supertype as well as other states. The post-condition rule says that the
subtype method's post-condition can be stronger than the supertype method's post-condition;
hence, any property that can be proved based on the supertype method's post-condition also
follows from the subtype's method's post-condition.

Definition of the subtype relation, <: $\sigma = \langle O_\sigma, S, M \rangle$ is a subtype of
$\tau = \langle O_\tau, T, N \rangle$ if there exists an abstraction function, $A : S \rightarrow T$, and a renaming map,
$R : M \rightarrow N$, such that:

1. The abstraction function respects invariants:

    • Invariant Rule. $\forall s : S \;.\; I_\sigma(s) \Rightarrow I_\tau(A(s))$

   A may be partial, need not be onto, but can be many-to-one.

2. Subtype methods preserve the supertype methods' behavior. If $m_\tau$ of $\tau$ is the
   corresponding renamed method $m_\sigma$ of $\sigma$, the following rules must hold:

    • Signature rule.

        - Contravariance of arguments. $m_\tau$ and $m_\sigma$ have the same number of arguments.
          If the list of argument types of $m_\tau$ is $\alpha_i$ and that of $m_\sigma$ is $\beta_i$, then
          $\forall i \;.\; \alpha_i < \beta_i$.

        - Covariance of result. Either both $m_\tau$ and $m_\sigma$ have a result or neither has.
          If there is a result, let $m_\tau$'s result type be $\gamma$ and $m_\sigma$'s be $\delta$. Then $\delta < \gamma$.

        - Exception rule. The exceptions signaled by $m_\sigma$ are contained in the set of
          exceptions signaled by $m_\tau$.

    • Methods rule. For all $x : \sigma$:

        - Pre-condition rule. $m_\tau.pre[A(x_{pre})/x_{pre}] \Rightarrow m_\sigma.pre$.

        - Post-condition rule. $m_\sigma.post \Rightarrow m_\tau.post[A(x_{pre})/x_{pre}, A(x_{post})/x_{post}]$

3. Subtype constraints ensure supertype constraints.

    • Constraint Rule. For all $x : \sigma \;.\; C_\sigma(x) \Rightarrow C_\tau[A(x_\rho)/x_\rho, A(x_\psi)/x_\psi]$

Figure 4: Definition of the Subtype Relation (Constraint Rule)

We do not consider invariants as shorthand for explicit conjuncts in a method's pre- and
post-conditions because if we did the pre-condition rule would require that the supertype's
invariant implies a subtype's. Usually just the opposite holds. For example, suppose a smallbag
type is like the bag type except that its bound must be equal to 20:

invariant $|b_\rho.elems| \le b_\rho.bound \wedge b_\rho.bound = 20$

To show smallbag is a subtype of bag, for the pre-condition rule for the equal method we would
need to show that:

$I_{bag} \Rightarrow I_{smallbag}$

which is not true. In fact, the converse holds.
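The direction of the implication can be checked on a small finite domain. The following Python
sketch is only an illustration: the two predicate functions and the range of sizes and bounds are
assumptions made here, and a bag value is reduced to the pair (size of elems, bound).

```python
def inv_bag(size, bound):
    """bag's invariant: |elems| <= bound."""
    return size <= bound

def inv_smallbag(size, bound):
    """smallbag's invariant: |elems| <= bound and bound = 20."""
    return size <= bound and bound == 20

# All (size, bound) pairs with both components below 25.
domain = [(n, b) for b in range(25) for n in range(25)]

bag_implies_small = all(inv_smallbag(n, b) for n, b in domain if inv_bag(n, b))
small_implies_bag = all(inv_bag(n, b) for n, b in domain if inv_smallbag(n, b))
print(bag_implies_small, small_implies_bag)  # False True
```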

Finally, the third clause succinctly and directly states that constraints must be preserved.
This requirement (and the assumption that each type specification satisfies its constraint)
suffices to argue that history properties of a supertype are preserved.

5.2.2  Applying  the  Definition  of  Subtyping  as  a  Checklist 

Proofs  of  the  subtype  relation  are  usually  obvious  and  can  be  done  by  inspection.  Typically, 
the  only  interesting  part  is  the  definition  of  the  abstraction  function;  the  other  parts  of  the 
proof  are  usually  trivial.  However,  this  section  goes  through  the  steps  of  an  informal  proof  just 
to  show  what  kind  of  reasoning  is  involved.  Formal  versions  of  these  informal  proofs  are  given 
in  [28]. 

Let's revisit the stack and bag example using our definition as a checklist. Here
$\sigma = \langle O_{stack}, S, \{push, pop, swap\_top, height, equal\} \rangle$, and
$\tau = \langle O_{bag}, B, \{put, get, card, equal\} \rangle$.
Recall that we represent a bounded bag's value as a pair, <elems, bound>, of a multiset
of integers and a fixed bound, and a bounded stack's value as a pair, <items, limit>, of a
sequence of integers and a fixed bound. It can easily be shown that each specification preserves
its invariant and satisfies its constraint.

We use the abstraction function and the renaming map given in the specification for stack
in Figure 3. The abstraction function states that for all $st : S$

$A(st) = \langle mk\_elems(st.items), st.limit \rangle$

where the helping function, $mk\_elems : Seq \rightarrow M$, maps sequences to multisets and states that
for all $sq : Seq$, $i : Int$:

$mk\_elems([\,]) = \{\,\}$
$mk\_elems(sq \,\|\, [i]) = mk\_elems(sq) \cup \{i\}$

The renaming map R is

R(push) = put
R(pop) = get
R(height) = card
R(equal) = equal
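The two maps translate almost directly into executable form. The sketch below is an illustration
in Python, not the paper's notation: stack values are assumed to be (items, limit) pairs, multisets
are modelled with `collections.Counter`, and the dictionary `R` is a hypothetical encoding of the
renaming map.

```python
from collections import Counter

def mk_elems(seq):
    """Map a sequence of ints to a multiset, following the recursive definition."""
    if not seq:
        return Counter()                              # mk_elems([]) = {}
    return mk_elems(seq[:-1]) + Counter([seq[-1]])    # mk_elems(sq || [i])

def A(stack_value):
    """Abstraction function: <items, limit> -> <elems, bound>."""
    items, limit = stack_value
    return (mk_elems(items), limit)

# Renaming map from stack methods to bag methods. swap_top is an extra
# method of stack and therefore has no entry.
R = {"push": "put", "pop": "get", "height": "card", "equal": "equal"}

elems, bound = A(([3, 1, 3], 5))
print(elems[3], elems[1], bound)             # 2 1 5
print(A(([1, 3], 5)) == A(([3, 1], 5)))      # True: A is many-to-one
```

The second line of output shows why A may be many-to-one: two stacks that differ only in the
order of their items map to the same bag value.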

Checking  the  signature  and  exception  rules  is  easy  and  could  be  done  by  the  compiler. 

Next, we show the correspondences between push and put, between pop and get, etc. Let's
look at the pre- and post-condition rules for just one method, push. Informally, the pre-condition
rule for put/push requires that we show*:

$|A(s_{pre}).elems| < A(s_{pre}).bound \;\Rightarrow\; length(s_{pre}.items) < s_{pre}.limit$

Intuitively,  the  pre-condition  rule  holds  because  the  length  of  stack  is  the  same  as  the  size  of 
the  corresponding  bag  and  the  limit  of  the  stack  is  the  same  as  the  bound  for  the  bag.  Here  is 
an  informal  proof  with  slightly  more  detail: 

1. A maps the stack's sequence component to the bag's multiset by putting all elements of
   the sequence into the multiset. Therefore the length of the sequence $s_{pre}.items$ is equal
   to the size of the multiset $A(s_{pre}).elems$.

2. Also, A maps the limit of the stack to the bound of the bag so that $s_{pre}.limit = A(s_{pre}).bound$.

3. From put's pre-condition we know $length(s_{pre}.items) < s_{pre}.limit$.

4. push's pre-condition holds by substituting equals for equals.

Note  the  role  of  the  abstraction  function  in  this  proof.  It  allows  us  to  relate  stack  and  bag 
values,  and  therefore  we  can  relate  predicates  about  bag  values  to  those  about  stack  values  and 
vice  versa.  Also,  note  how  we  depend  on  A  being  a  function  (in  step  (4)  where  we  use  the 
substitutivity  property  of  equality). 

The  post-condition  rule  requires  that  we  show  push's  post-condition  implies  put's.  We  can 
deal  with  the  modifies  and  ensures  parts  separately.  The  modifies  part  holds  because  the 
same  object  is  mentioned  in  both  specifications.  The  ensures  part  follows  from  the  definition 
of  the  abstraction  function. 
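For readers who prefer to see the two method rules exercised, here is a small brute-force check
for the push/put pair, written in Python as a sketch under stated assumptions: stack values are
(items, limit) tuples, `Counter` stands in for multisets, `push` is a hypothetical function that
returns the post-state described by push's ensures clause, and only stacks with limit at most 3
over the elements 0 and 1 are enumerated.

```python
from collections import Counter
from itertools import product

def A(stack_value):
    """Abstraction function: <items, limit> -> <elems, bound>."""
    items, limit = stack_value
    return (Counter(items), limit)

def put_pre(bag_value):
    elems, bound = bag_value
    return sum(elems.values()) < bound

def put_post(pre, post, i):
    """put's ensures clause over bag values."""
    return post[0] == pre[0] + Counter([i]) and post[1] == pre[1]

def push_pre(stack_value):
    items, limit = stack_value
    return len(items) < limit

def push(stack_value, i):
    """Post-state described by push's ensures clause."""
    items, limit = stack_value
    return (items + (i,), limit)

VALUES = (0, 1)
# All legal stack values (length <= limit) with limit at most 3.
STACKS = [(items, limit)
          for limit in range(4)
          for n in range(limit + 1)
          for items in product(VALUES, repeat=n)]

# Pre-condition rule: put.pre[A(s_pre)/s_pre] => push.pre
pre_rule = all(push_pre(s) for s in STACKS if put_pre(A(s)))
# Post-condition rule: push.post => put.post[A(s_pre)/s_pre, A(s_post)/s_post]
post_rule = all(put_post(A(s), A(push(s, i)), i)
                for s in STACKS if push_pre(s) for i in VALUES)
print(pre_rule, post_rule)  # True True
```

An exhaustive check over a finite domain is of course weaker than the informal proof; it only
illustrates what each rule asks us to establish.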

Finally, the constraint rule requires that we show that the constraint on stacks:

$s_\rho.limit = s_\psi.limit$

implies that on bags:

$b_\rho.bound = b_\psi.bound$

This is true because the abstraction function maps the limit of a stack to the bound of its bag
counterpart.

*Note that we are reasoning in terms of the values of the object, s, and that b and s refer to the same object.

Note that we do not have to say anything specific for swap_top.

5.3  Second  Definition:  Extension  Map 

With  the  constraint  approach  users  cannot  use  the  history  rule  to  deduce  history  properties. 
Our  second  approach  allows  them  to  do  so.  It  requires  that  we  “explain”  each  extra  method  in 
terms  of  existing  methods.  If  such  explanations  are  possible,  the  extra  methods  do  not  add  any 
behavior  that  could  not  have  been  effected  in  their  absence.  Therefore,  all  supertype  properties, 
including  history  properties,  are  preserved. 

In our alternative definition, therefore, we do not add any constraints to our type
specification (and thus remove the requirement that a type specification has to satisfy its constraint).
Instead, to show that $\sigma$ is a subtype of $\tau$ we require a third mapping, which we call an
extension map, that is defined for all extra methods introduced by the subtype. The extension map
"explains" the behavior of each extra method as a program expressed in terms of non-extra
methods. Interesting explanations are needed only for mutators; non-mutators always have the
"empty" explanation, $\epsilon$.

Figure  5  gives  the  alternative  definition.  As  before,  we  assume  each  type  specification 
preserves  its  invariant.  In  defining  the  extension  map,  we  intentionally  leave  unspecified  the 
language  in  which  one  writes  a  program,  but  imagine  that  it  has  the  usual  control  structures, 
assignment,  procedure  call,  etc. 

5.3.1  Discussion  of  Definition 

Definition of the subtype relation, <: $\sigma = \langle O_\sigma, S, M \rangle$ is a subtype of
$\tau = \langle O_\tau, T, N \rangle$ if there exists an abstraction function, A, a renaming map, R, and an
extension map, E, such that:

1. The abstraction function respects invariants:

    • Invariant Rule. $\forall s : S \;.\; I_\sigma(s) \Rightarrow I_\tau(A(s))$

2. Subtype methods preserve the supertype methods' behavior. If $m_\tau$ of $\tau$ is the
   corresponding renamed method $m_\sigma$ of $\sigma$, the following rules must hold:

    • Signature rule.

        - Contravariance of arguments. $m_\tau$ and $m_\sigma$ have the same number of arguments.
          If the list of argument types of $m_\tau$ is $\alpha_i$ and that of $m_\sigma$ is $\beta_i$, then
          $\forall i \;.\; \alpha_i < \beta_i$.

        - Covariance of result. Either both $m_\tau$ and $m_\sigma$ have a result or neither has.
          If there is a result, let $m_\tau$'s result type be $\gamma$ and $m_\sigma$'s be $\delta$. Then $\delta < \gamma$.

        - Exception rule. The exceptions signaled by $m_\sigma$ are contained in the set of
          exceptions signaled by $m_\tau$.

    • Methods rule. For all $x : \sigma$:

        - Pre-condition rule. $m_\tau.pre[A(x_{pre})/x_{pre}] = m_\sigma.pre$.

        - Post-condition rule. $m_\sigma.post \Rightarrow m_\tau.post[A(x_{pre})/x_{pre}, A(x_{post})/x_{post}]$

3. The extension map, $E : O_\sigma \times M \times Obj^* \rightarrow Prog$, must be defined for each method,
   m, not in dom(R). We write E(x.m(a)) for E(x, m, a) where x is the object on which
   m is invoked and a is the (possibly empty) sequence of arguments to m. E's range is
   the set of programs, including the empty program denoted as $\epsilon$.

    • Extension rule. For each new method, m, of $x : \sigma$, the following conditions must
      hold for $\pi$, the program to which E(x.m(a)) maps:

        - The input to $\pi$ is the sequence of objects $[x] \,\|\, a$.

        - The set of methods invoked in $\pi$ is contained in the union of the set of
          methods of all types other than $\sigma$ and the set of methods dom(R).

        - Diamond rule. We need to relate the abstracted values of x at the end of
          either calling just m or executing $\pi$. Let $\rho_1$ be the state in which both m is
          invoked and $\pi$ starts. Assume m.pre holds in $\rho_1$ and the call to m terminates
          in state $\rho_2$. Then we require that $\pi$ terminates in state $\psi$ and

          $A(x_{\rho_2}) = A(x_\psi)$.

          Note that if $\pi = \epsilon$, $\psi = \rho_1$.

Figure 5: Definition of the Subtype Relation (Extension Rule)

Figure 6: The Diamond Diagram

The first and second clauses are the same as in the first definition except that the pre-condition
rule is stronger. Since the extension map is defined just for the extra methods, it is possible
for a subtype to redefine a supertype's (non-extra) method in a way that causes a violation of
a history property of the supertype. For example, suppose we have a window, w, with a move
method

move = proc (v: vector)
    requires $v.x > 0 \wedge v.y > 0$
    ensures $w_{post}.center = w_{pre}.center + v$

that guarantees a window always moves in a northeasterly direction. Suppose a my_window
type is just like window except with a weaker move method:

move = proc (v: vector)
    ensures $w_{post}.center = w_{pre}.center + v$

The methods rule given previously, in particular the pre-condition rule, holds, but clients of
window objects would be surprised if a my_window object were used (and moved) in place of a
window.^ To rule out this behavior, we require that the pre-condition of each non-extra method
be the same as the corresponding supertype's method.* Note that America uses the weaker
pre-condition rule of Figure 4, and therefore he would erroneously allow subtype relations like
this one, since his technique does not describe the constraints explicitly.
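The window example can be made concrete with a short Python sketch. Everything in it is an
assumption for illustration: centers and vectors are integer pairs, the history property clients
rely on is taken to be "neither coordinate of the center ever decreases", and only a small grid
of vectors is enumerated.

```python
def window_move_pre(v):
    """window's move: requires v.x > 0 and v.y > 0."""
    return v[0] > 0 and v[1] > 0

def my_window_move_pre(v):
    """my_window's move has no requires clause."""
    return True

def move(center, v):
    """Both types share the ensures clause: center_post = center_pre + v."""
    return (center[0] + v[0], center[1] + v[1])

def moves_northeast(before, after):
    """History property that clients of window may rely on."""
    return after[0] >= before[0] and after[1] >= before[1]

VECTORS = [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)]
start = (0, 0)

# The weaker pre-condition rule of Figure 4 is satisfied...
weaker_rule = all(my_window_move_pre(v) for v in VECTORS if window_move_pre(v))
# ...and window preserves the history property, but my_window does not.
window_ok = all(moves_northeast(start, move(start, v))
                for v in VECTORS if window_move_pre(v))
my_window_ok = all(moves_northeast(start, move(start, v))
                   for v in VECTORS if my_window_move_pre(v))
print(weaker_rule, window_ok, my_window_ok)  # True True False
```

Requiring identical pre-conditions, as the second definition does, rejects my_window because
its pre-condition differs from window's.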

The  third  clause  of  the  definition  requires  what  is  shown  in  the  diamond  diagram  in  Figure  6, 
read  from  top  to  bottom.  We  must  show  that  the  abstract  value  of  the  subtype  object  reached 
by  running  the  extra  method  m  is  also  reached  by  running  m’s  explanation  program.  This 
diagram  is  not  quite  like  a  standard  commutative  diagram  because  we  are  applying  subtype 
methods  to  the  same  subtype  object  in  both  cases  (m  and  E(x.m(a)))  and  then  showing  the 
two values obtained map via the abstraction function to the same supertype value.

The extension rule constrains only what an explanation program does to its method's object,
not to other objects. This makes sense because explanation programs do not really run. The
purpose of an explanation program is to explain how an object could be in a particular state.
Its other arguments are hypothetical; they are not objects that actually exist in the object universe.

The diamond rule is stronger than necessary because it requires equality between
abstract values. We need only the weaker notion of observable equivalence (e.g., see Kapur's
definition [20]), since values that are distinct may not be observably different if the supertype's
set of methods (in particular, observers) is too weak to let us perceive the difference. In practice,
such types are rare and therefore we did not bother to provide the weaker definition.

^Thanks to Ian Maung for pointing out this problem and inspiring this example.

*An alternative solution to this problem would be to define the extension map for all methods, not just extra
ones.

Preservation  of  history  properties  is  ensured  by  a  combination  of  the  methods  and  extension 
rules;  they  together  guarantee  that  any  call  of  a  subtype  method  can  be  explained  in  terms  of 
calls  of  methods  that  are  already  defined  for  the  supertype.  To  show  that  history  properties  are 
preserved  by  non-extra  mutators,  we  use  the  methods  rule.  However,  because  the  properties  are 
not  stated  explicitly,  they  cannot  be  proved  for  the  extra  methods.  Instead  extra  methods  must 
satisfy  any  possible  property,  which  is  surely  guaranteed  if  the  extra  methods  can  be  explained 
in  terms  of  the  non-extra  methods  via  the  extension  map. 

5.3.2  The  Bag  and  Stack  Example  Again 

The  alternative  definition  of  subtyping  is  also  used  as  a  checklist  to  prove  a  subtype  relation. 
Besides  the  abstraction  function,  the  only  other  interesting  issue  is  the  definition  of  the  extension 
map.  As  was  the  case  with  the  constraint  approach,  the  actual  proofs  are  usually  trivial. 

To prove that stack is a subtype of bag we follow the same procedure as in Section 5.2.2,
except we need to show that the pre-conditions are identical, a trivial exercise for this example.
We must additionally define an extension map to define swap_top's effect. As stated earlier, it
has the same effect as that described by the program, $\pi$, in which a call to pop is followed by
one to push:

E(s.swap_top(i)) = s.pop(); s.push(i)

Showing the extension rule is just like showing that an implementation of a procedure satisfies
the procedure's specification, except that we do not require equal values at the end, but just
equal abstract values. (In fact, such a proof is identical to a proof showing that an
implementation of an operation of an abstract data type satisfies its specification [19].) In doing the
reasoning we rely on the specifications of the methods used in the program. Here is an informal
argument for swap_top. We note first that since s.swap_top(i) terminates normally, so does the
call on s.pop() (their pre-conditions are the same). Pop removes the top element, reducing the
size of the stack so that push's pre-condition holds, and then push puts i on the top of the
stack. The result is that the top element has been replaced by i. Thus, $s_{\rho_2} = s_\psi$, where $\rho_2$ is
the termination state if we run swap_top and $\psi$ is the termination state if we run $\pi$. Therefore
$A(s_{\rho_2}) = A(s_\psi)$, since A is a function.
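The diamond rule for swap_top can also be checked mechanically on a small model. The Python
sketch below makes several assumptions that are not in the paper: stack values are (items, limit)
tuples, each method is a pure function returning the post-state given by its ensures clause (pop's
result is dropped), and only stacks with limit at most 3 over the elements 0 and 1 are enumerated.

```python
from collections import Counter
from itertools import product

def A(stack_value):
    """Abstraction function: <items, limit> -> <elems, bound>."""
    items, limit = stack_value
    return (Counter(items), limit)

def pop(s):
    return (s[0][:-1], s[1])

def push(s, i):
    return (s[0] + (i,), s[1])

def swap_top(s, i):
    return (s[0][:-1] + (i,), s[1])

def explanation(s, i):
    """The program E(s.swap_top(i)) = s.pop(); s.push(i)."""
    after_pop = pop(s)
    assert len(after_pop[0]) < after_pop[1]   # push's pre-condition holds here
    return push(after_pop, i)

VALUES = (0, 1)
# Legal stack values (length <= limit) that satisfy swap_top's pre-condition.
STACKS = [(items, limit)
          for limit in range(1, 4)
          for n in range(1, limit + 1)
          for items in product(VALUES, repeat=n)]

# Diamond rule: the abstract values reached by m and by its explanation agree.
diamond = all(A(swap_top(s, i)) == A(explanation(s, i))
              for s in STACKS for i in VALUES)
print(diamond)  # True
```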
5.4 Comparing the Two Definitions

The  approach  using  explicit  constraints  is  appealing  because  it  is  so  simple.  In  addition,  explicit 
constraints  allow  us  to  rule  out  unintended  properties  that  happen  to  be  true  because  of  an  error 
in  a  method  specification.  Having  both  the  constraint  and  the  method  specifications  is  a  form  of 
useful redundancy: If the two are not consistent, this indicates an error in the specification. The
error  can  then  be  removed  (either  by  changing  the  constraint  or  some  method  specification). 
Therefore,  including  constraints  in  specifications  makes  for  a  more  robust  methodology. 

Explicit  constraints  also  allow  us  to  state  the  common  properties  of  type  families  directly. 
With  the  explanation  approach,  it  is  sometimes  necessary  to  introduce  extra  methods  in  the 
supertype  to  ensure  that  history  properties  that  do  not  hold  for  subtypes  cannot  be  proved  for 
supertypes. An example is given in Section 6, when we discuss a varying_bag type.

On  the  minus  side  is  the  loss  of  the  history  rule.  Users  are  not  permitted  to  use  the  history 
rule  because  if  they  did,  they  might  be  able  to  prove  history  properties  that  a  subtype  did  not 
ensure.  Therefore  the  specifier  must  be  careful  to  define  a  strong  enough  constraint.  In  our 
experience the desired constraint is usually obvious. However, suppose the definer of fat_set
mistakenly gives the following constraint:

constraint $|s_\rho| \le |s_\psi|$

Users would then be unable to deduce that once an element is added to a fat_set it will always
be there (since they are not allowed to use the history rule).
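The gap between the two constraints is easy to exhibit. In the following Python sketch, which is
an illustration only, set values are modelled as `frozenset`s over a three-element universe (an
assumption), `intended` is the constraint given earlier for fat_set, and `mistaken` is the
cardinality constraint above.

```python
from itertools import combinations

def intended(rho, psi):
    """For all x: x in s_rho => x in s_psi."""
    return rho <= psi

def mistaken(rho, psi):
    """|s_rho| <= |s_psi|."""
    return len(rho) <= len(psi)

UNIVERSE = (0, 1, 2)
SETS = [frozenset(c)
        for n in range(len(UNIVERSE) + 1)
        for c in combinations(UNIVERSE, n)]

# The intended constraint implies the mistaken one...
stronger = all(mistaken(r, p) for r in SETS for p in SETS if intended(r, p))
# ...but some state pairs are admitted only by the mistaken one.
gap = [(r, p) for r in SETS for p in SETS
       if mistaken(r, p) and not intended(r, p)]
print(stronger, len(gap) > 0)  # True True
```

A user who may reason only from the mistaken constraint cannot exclude the pairs in `gap`, such
as a set {0} followed later by {1}, even though no fat_set method could produce them.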

In  summary,  having  an  explicit  constraint  is  appealing  because  the  subtype  relation  is 
simple,  it  allows  us  to  state  properties  of  type  families  declaratively,  and  the  constraint  acts  as 
a  check  on  the  correctness  of  a  specification.  The  drawback  is  that  if  some  property  is  left  out 
of  the  constraint,  there  is  no  way  users  can  make  use  of  it. 

6  Type  Hierarchies 

The  requirement  we  impose  on  subtypes  is  very  strong  and  raises  a  concern  that  it  might 
rule  out  many  useful  subtype  relations.  To  address  this  concern  we  applied  our  method  to  a 
number  of  examples.  We  found  that  our  technique  captures  what  people  want  from  a  hierarchy 
mechanism,  but  we  also  discovered  some  surprises. 

The examples led us to classify subtype relationships into two broad categories. In the
first category, the subtype extends the supertype by providing additional methods and possibly
additional "state." In the second, the subtype is more constrained than the supertype. We
discuss these relationships below.

6.1  Extension  Subtypes 

A  subtype  extends  its  supertype  if  its  objects  have  extra  methods  in  addition  to  those  of 
the  supertype.  Abstraction  functions  for  extension  subtypes  are  onto,  i.e.,  the  range  of  the 
abstraction  function  is  the  set  of  all  legal  values  of  the  supertype.  The  subtype  might  simply 
have  more  methods;  in  this  case  the  abstraction  function  is  one-to-one.  Or  its  objects  might 
also  have  more  “state,”  i.e.,  they  might  record  information  that  is  not  present  in  objects  of  the 
supertype;  in  this  case  the  abstraction  function  is  many-to-one. 

As an example of the one-to-one case, consider a type intset (for set of integers) with methods
to insert and delete elements, to select elements, and to provide the size of the set. A subtype,
my_intset, might have more methods, e.g., union, is_empty. Here there is no extra state, just
extra methods. If we are using the extension map approach, we must provide explanations
for the extra methods, but for all but mutators, these are trivial. Thus, if union is a pure
constructor, it has the empty explanation, $\epsilon$; otherwise it requires a non-trivial explanation,
e.g., in terms of insert. If we are using the constraint approach, we must prove that the subtype's
constraint implies that of the supertype. Often the two constraints will be identical, e.g., both
intset and my_intset might have the trivial constraint.

Using either approach, it is easy to discover when a proposed subtype really is not one.
For example, intset is not a subtype of fat_set because fat_sets only grow while intsets grow
and shrink, i.e., intset does not preserve various history properties of fat_set. If we are using the
constraint approach, we would be unable to show that the intset constraint (which is trivial)
implies that of fat_set; with the extension map approach, we will not be able to explain the
effect of intset's delete method.

As a simple example of a many-to-one case, consider immutable pairs and triples (Figure
7). Pairs have methods that fetch the first and second elements; triples have these methods
plus an additional one to fetch the third element. Triple is a subtype of pair, and so is
semi-mutable triple, which has methods to fetch the first, second, and third elements and to
replace the third element, because replacing the third element does not affect the first or second
element. This example shows that it is possible to have a mutable subtype of an immutable
supertype, provided the mutations are invisible to users of the supertype.

                  immutable pair
                 /              \
    immutable triple        semi-mutable triple

Figure 7: Pairs and Triples


Mutations of a subtype that would be visible through the methods of an immutable supertype
are ruled out. For example, an immutable sequence, whose elements can be fetched but
not  stored,  is  not  a  supertype  of  mutable  array,  which  provides  a  store  method  in  addition  to 
the  sequence  methods.  For  sequences  we  can  prove  elements  do  not  change;  this  is  not  true 
for  arrays.  The  attempt  to  construct  the  subtype  relation  will  fail  because  there  is  no  way  to 
explain  the  store  method  via  an  extension  map  or  because  the  constraint  for  sequences  does 
not  follow  from  that  of  arrays. 

Many  examples  of  extension  subtypes  are  found  in  the  literature.  One  common  example 
concerns  persons,  employees,  and  students  (Figure  8).  A  person  object  has  methods  that 
report  its  properties  such  as  its  name,  age,  and  possibly  its  relationship  to  other  persons  (e.g., 
its parents or children). Student and employee are subtypes of person; in each case they have
additional properties, e.g., a student id number, an employee employer and salary. In addition,
type student_employee is a subtype of both student and employee (and also person, since the
subtype  relation  is  transitive).  In  this  example,  the  subtype  objects  have  more  state  than  those 
of  the  supertype  as  well  as  more  methods. 


Figure  8:  Person,  Student,  and  Employee 


Another example from the database literature concerns different kinds of ships [18]. The
supertype is generic ships with methods to determine such things as who is the captain and where
the ship is registered. Subtypes contain more specialized ships such as tankers and freighters.
There can be quite an elaborate hierarchy (e.g., tankers are a special kind of freighter). Windows
are another well-known example [17]; subtypes include bordered windows, colored windows, and
scrollable windows.

Common examples of subtype relationships are allowed by our definition provided the equal
method (and other similar methods) are defined properly in the subtype. Suppose supertype $\tau$
provides an equal method and consider a particular call x.equal(y). The difficulty arises when
x and y actually belong to $\sigma$, a subtype of $\tau$. If objects of the subtype have additional state,
x and y may differ when considered as subtype objects but ought to be considered equal when
considered as supertype objects.

For example, consider immutable triples x = <0, 0, 0> and y = <0, 0, 1>. Suppose the
specification of the equal method for pairs says:

equal = proc (q: pair) returns (bool)
    ensures $result = (p.first = q.first \wedge p.second = q.second)$

(We are using p to refer to the method's object.) However, for triples we would expect the
following specification:

equal = proc (q: triple) returns (bool)
    ensures $result = (p.first = q.first \wedge p.second = q.second \wedge p.third = q.third)$

If a program using triples had just observed that x and y differ in their third element, we would
expect x.equal(y) to return "false." However, if the program were using them as pairs, and had
just observed that their first and second elements were equal, it would be wrong for the equal
method to return false.

The way to resolve this dilemma is to have two equal methods in triple:

pair_equal = proc (q: pair) returns (bool)
    ensures $result = (p.first = q.first \wedge p.second = q.second)$

triple_equal = proc (q: triple) returns (bool)
    ensures $result = (p.first = q.first \wedge p.second = q.second \wedge p.third = q.third)$

One of them (pair_equal) simulates the equal method for pair; the other
(triple_equal) is a method just on triples.
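A minimal Python sketch of this arrangement follows. The class names, the use of frozen
dataclasses, and the choice to implement triple_equal in terms of pair_equal are assumptions of
the illustration, not part of the specifications above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pair:
    first: int
    second: int

    def pair_equal(self, q: "Pair") -> bool:
        """Equality as seen by users of pair: compares first and second only."""
        return self.first == q.first and self.second == q.second

@dataclass(frozen=True)
class Triple(Pair):
    third: int

    def triple_equal(self, q: "Triple") -> bool:
        """Equality as seen by users of triple: also compares third."""
        return self.pair_equal(q) and self.third == q.third

x, y = Triple(0, 0, 0), Triple(0, 0, 1)
print(x.pair_equal(y), x.triple_equal(y))  # True False
```

Code written against Pair keeps calling pair_equal and gets the answer the pair specification
promises, even when it is handed triples.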

The  problem  is  not  limited  to  equality  methods.  It  also  affects  methods  that  “expose”  the 
abstract  state  of  objects,  e.g.,  an  unparse  method  that  returns  a  string  representation  of  the 
abstract state of its object. x.unparse() ought to return a representation of a pair if called in a
context in which x is considered to be a pair, but it ought to return a representation of a triple
in a context in which x is known to be a triple (or some subtype of triple).

The need for several equality methods seems natural for realistic examples. For example,
asking whether e1 and e2 are the same person is different from asking if they are the same
employee. In the case of a person holding two jobs, the answer might be true for the question
about person but false for the question about employee.

6.2  Constrained  Subtypes 

The second type of subtype relation occurs when the subtype is more constrained than the supertype.
In this case, the supertype specification will always be nondeterministic; its purpose is
to allow variations in behavior among its subtypes. Subtypes constrain the supertype by reducing
or eliminating the nondeterminism, either in what the methods do or in the value spaces
of  objects  or  by  having  a  tighter  constraint.  The  abstraction  function  is  usually  into  rather 
than  onto.  The  subtype  may  extend  those  supertype  objects  that  it  simulates  by  providing 
additional  methods  and/or  state. 

A  very  simple  example  concerns  elephants.  Elephants  come  in  many  colors  (realistically 
grey  and  white,  but  we  will  also  allow  blue  ones).  However  all  albino  elephants  are  white  and 
all  royal  elephants  are  blue.  Figure  9  shows  the  elephant  hierarchy.  The  set  of  legal  values  for 
regular  elephants  includes  all  elephants  whose  color  is  grey  or  blue  or  white: 

invariant $e_\rho.color = white \lor e_\rho.color = grey \lor e_\rho.color = blue$

The set of legal values for royal elephants is a subset of those for regular elephants:

invariant $e_\rho.color = blue$

and hence the abstraction function is into. The situation for albino elephants is similar. In
addition, the get_color method for elephant is nondeterministic but deterministic for royal and
albino  elephants.  This  simple  example  has  led  others  to  define  a  subtyping  relation  that  requires 
non-monotonic  reasoning  [25],  but  we  believe  it  is  better  to  use  a  nondeterministic  specification 
and  straightforward  reasoning  methods.  However,  the  example  shows  that  a  specifier  of  a  type 
family  has  to  anticipate  subtypes  and  capture  the  variation  among  them  in  a  nondeterministic 
specification  of  the  supertype. 
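
The subset relationship between the value spaces can be checked mechanically for a finite set of colors. The sketch below is illustrative only: the function names, the string encoding of colors, and the extra color "pink" (included so that the universe contains an illegal value) are assumptions, not part of the elephant specification.

```python
ELEPHANT_COLORS = {"white", "grey", "blue"}
ALL_COLORS = ELEPHANT_COLORS | {"pink"}  # "pink" is a made-up illegal value


def elephant_invariant(color: str) -> bool:
    # Nondeterministic supertype: any of three colors is legal.
    return color in ELEPHANT_COLORS


def royal_invariant(color: str) -> bool:
    # Royal elephants resolve the choice: always blue.
    return color == "blue"


def albino_invariant(color: str) -> bool:
    # Albino elephants resolve the choice: always white.
    return color == "white"


def implies(sub_inv, super_inv, universe) -> bool:
    # Every value legal for the subtype must be legal for the supertype.
    return all(super_inv(c) for c in universe if sub_inv(c))
```

Here implies(royal_invariant, elephant_invariant, ALL_COLORS) holds, while the converse does not, which matches the observation that the abstraction function is into rather than onto.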

elephant 

royal  albino 

Figure  9:  Elephant  Hierarchy 

Another similar example concerns geometric figures. At the top of the hierarchy is the
polygon type; it allows an arbitrary number of sides and angles of arbitrary sizes. Subtypes
place various restrictions on these quantities. A portion of the hierarchy is shown in Figure 10.


polygon

triangle    quadrilateral

scalene     rhombus
   |           |
isosceles   square

Figure 10: Polygon Hierarchy

The  bag  type  discussed  in  Section  4.1  is  nondeterministic  in  two  ways.  As  discussed  earlier, 
the  specification  of  get  is  nondeterministic  because  it  does  not  constrain  which  element  of  the 
bag  is  removed.  This  nondeterminism  allows  stack  to  be  a  subtype  of  bag:  The  specification 
of pop constrains the nondeterminism. We could also define a queue that is a subtype of bag;
its  dequeue  method  would  also  constrain  the  nondeterminism  of  get  but  in  a  way  different  from 
pop. 
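
A small Python sketch may make the role of the nondeterministic get concrete. It rests on several assumptions that are not in the specifications above: the subtypes override get directly instead of relating pop and dequeue to get through a renaming, a full bag raises OverflowError, the random choice merely stands in for an unconstrained choice, and the checker get_meets_bag_spec is a hypothetical helper.

```python
import random


class Bag:
    def __init__(self, bound: int):
        self.bound = bound
        self.elems = []

    def put(self, x):
        # Assumed failure mode: a full bag rejects the new element.
        if len(self.elems) >= self.bound:
            raise OverflowError("bag is full")
        self.elems.append(x)

    def get(self):
        # Requires a non-empty bag. The spec only promises that some
        # element is removed and returned; it does not say which one.
        return self.elems.pop(random.randrange(len(self.elems)))


class Stack(Bag):
    def get(self):
        # Plays the role of pop: always the most recently added element.
        return self.elems.pop()


class Queue(Bag):
    def get(self):
        # Plays the role of dequeue: always the oldest element.
        return self.elems.pop(0)


def get_meets_bag_spec(b: Bag) -> bool:
    # Checks one call of get against the bag post-condition:
    # the result was in the bag and exactly that element was removed.
    before = list(b.elems)
    x = b.get()
    if x not in before:
        return False
    before.remove(x)
    return sorted(before) == sorted(b.elems)
```

Both overrides pass get_meets_bag_spec, because each makes a choice the bag specification permits; they differ only in which permitted choice they make.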

In  addition,  since  the  actual  value  of  the  bound  for  bags  was  not  defined,  it  can  be  any 
natural  number,  thus  allowing  subtypes  to  have  different  bounds.  This  nondeterminism  shows 
up  in  the  specification  of  put,  where  we  do  not  say  what  specific  bound  value  causes  the  call 
to  fail.  Therefore,  a  user  of  put  must  be  prepared  for  a  failure  unless  it  is  possible  to  deduce 
from  past  evidence,  using  the  history  property  (or  constraint)  that  the  bound  of  a  bag  does  not 
change,  that  the  call  will  succeed.  A  subtype  of  bag  might  limit  the  bound  to  a  fixed  value,  or 
to  a  smaller  range.  Several  subtypes  of  bag  are  shown  in  Figure  11;  mediumbags  have  various 
bounds, so that this type might have its own subtypes, e.g., bag_150.

The bag hierarchy may seem counterintuitive, since we might expect that bags with smaller
bounds should be subtypes of bags with larger bounds. For example, we might expect smallbag
to be a subtype of largebag. However, the specifications for the two types are incompatible:
the bound of every largebag is $2^{32}$, which is clearly not true for smallbags. Furthermore, this
difference is observable via the methods: It is legal to call the put method on a largebag whose
size is greater than or equal to 20, but the call is not legal for a smallbag. Therefore the
pre-condition rule is not satisfied.

                            bag

     largebag              mediumbag                 smallbag
(bound(b) = 2^32)   (100 <= bound(b) <= 1000)    (bound(b) = 20)

                            bag_150
                       (bound(b) = 150)

Figure 11: A Type Family for Bags

Although the bag type can have subtypes with different bounds, it is not a valid supertype
of a dynamic_bag type where the bounds of the bags can change dynamically. Dynamic_bags
would have an additional method, change_bound:

change_bound = proc (n: int)
    requires $n > |b_{pre}.elems|$
    modifies b
    ensures $b_{post}.elems = b_{pre}.elems \land b_{post}.bound = n$

If we wanted a type family that included both dynamic_bag and bag, we would need to define
a supertype in which the bound is allowed, but not required, to vary. Figure 12 shows the
new type hierarchy. This example points out an interesting difference between the two subtype
definitions. If we are using the extension map approach, varying_bag would need to have a
change_bound method that allows the bag’s bound to change, but does not require it. The
method is needed because otherwise the history rule would allow us to deduce that the bound
does not change! The nondeterminism in its specification is resolved in its subtypes; bag (and its
subtypes) provides a change_bound method that leaves the bound as it was, while dynamic_bag
changes it to the new bound. Note that for bag to be a subtype of varying_bag, it must
have a change_bound method (in addition to its other methods), even though the method isn’t
interesting.

On the other hand, if we are using the constraint approach, varying_bag and bag need not
have a change_bound method. Instead, varying_bag simply has the trivial constraint. This
means that its users cannot deduce anything about the bounds of its objects: the bound of
an object might change or it might not. Therefore it can have both bag and dynamic_bag
as subtypes. The constraint for bag (that a bag’s bound does not change) allows users of its
objects to depend on this property.

                  varying_bag
      (bound may change or stay the same)

   dynamic_bag                     bag
(bound may change)       (bound stays the same)

                         [...as in Fig. 11...]

Figure 12: Another Type Family for Bags

The varying_bag example illustrates a subtype that reduces nondeterminism in the constraint.
The constraint for varying_bag can be thought of as being “either a bag’s bound
changes or it does not”; the constraint for bounded_bag reduces this nondeterminism by making
a choice (“the bag’s bound does not change”). A similar example is a family of integer
counters shown in Figure 13. When a counter is advanced, we only know that its value gets
bigger, so that the constraint is simply

constraint $c_\rho < c_\psi$

The doubler and multiplier subtypes have stronger constraints. For example, a multiplier’s
value always increases by a multiple, so that its constraint is:

constraint $\exists\, n : int \,.\, [\, n > 0 \land c_\psi = n * c_\rho \,]$

For a family like this, we might choose to have an advance method for counter (so that each of
its subtypes is constrained to have this method) or we might not, but this choice is available to
us only if we use the constraint method.
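
The counter constraints can be read as predicates over a pair of values, one from an earlier state and one from a later state. The following sketch checks them over a recorded history of values; the function names, the representation of a history as a list of the values seen after successive advances, and the restriction to positive values are all assumptions made for illustration.

```python
from itertools import combinations


def counter_constraint(c_rho: int, c_psi: int) -> bool:
    # Supertype constraint: the later value is bigger.
    return c_rho < c_psi


def multiplier_constraint(c_rho: int, c_psi: int) -> bool:
    # Holds when c_psi = n * c_rho for some integer n > 0.
    if c_rho == 0:
        return c_psi == 0
    return c_psi % c_rho == 0 and c_psi // c_rho > 0


def history_satisfies(constraint, values) -> bool:
    # The constraint must hold for every earlier/later pair of values,
    # not only for adjacent ones.
    return all(constraint(a, b) for a, b in combinations(values, 2))
```

For the history [1, 2, 4, 8] both constraints hold. For [1, 2, 3] the counter constraint holds but the multiplier constraint fails on the pair (2, 3), so a counter with that behavior could not be a multiplier.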


counter 

(value  never  decreases) 


incrementer  doubler  multiplier 
(value never decreases)  (value doubles)  (value multiplies)

Figure  13:  Type  Family  for  Counters 

In the case of the bag family illustrated in Figure 11, all types in the hierarchy might
actually be implemented. However, sometimes supertypes are not intended to be implemented;
instead they are virtual types that let us define the properties all subtypes have in common.
Varying_bag is an example of such a type.

Virtual  types  are  also  needed  when  we  construct  a  hierarchy  for  integers.  Smaller  integers 
cannot  be  a  subtype  of  larger  integers  because  of  observable  differences  in  behavior;  for  example, 
an  overflow  exception  that  would  occur  when  adding  two  32-bit  integers  would  not  occur  if  they 
were  64-bit  integers.  Also,  larger  integers  cannot  be  a  subtype  of  smaller  ones  because  exceptions 
do  not  occur  when  expected.  However,  we  clearly  would  like  integers  of  different  sizes  to  be 
related.  This  is  accomplished  by  designing  a  nondeterministic,  virtual  supertype  that  includes 
them.  Such  a  hierarchy  is  shown  in  Figure  14,  where  integer  is  a  virtual  type.  Here  integer 
types  with  different  sizes  are  subtypes  of  integer.  In  addition,  small  integer  types  are  subtypes 
of regular_int, another virtual type. Such a hierarchy might have a structure like this, or it
might  be  flatter  by  having  all  integer  types  be  direct  subtypes  of  integer. 
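
The observable difference between integer sizes, and the way a nondeterministic virtual supertype absorbs it, can be sketched as follows. Everything here is an illustrative assumption: the class names, two's-complement ranges, the use of OverflowError as the overflow exception, and a deliberately loose supertype specification in which add either returns the exact sum or signals overflow.

```python
class Integer:
    # Virtual supertype: not meant to be instantiated directly.
    BITS = None

    def __init__(self, value: int):
        if self.BITS is None:
            raise TypeError("integer is a virtual type")
        limit = 1 << (self.BITS - 1)
        if not -limit <= value < limit:
            raise OverflowError(f"{value} does not fit in {self.BITS} bits")
        self.value = value

    def add(self, other: "Integer") -> "Integer":
        # Constructing the result raises OverflowError if the sum is too big.
        return type(self)(self.value + other.value)


class Int32(Integer):
    BITS = 32


class Int64(Integer):
    BITS = 64


def add_meets_integer_spec(a: Integer, b: Integer) -> bool:
    # Assumed supertype spec: add returns the exact sum or signals overflow.
    try:
        result = a.add(b)
    except OverflowError:
        return True
    return result.value == a.value + b.value
```

Adding 1 to 2**31 - 1 overflows for Int32 but not for Int64, so neither type can stand in for the other, yet both behaviors are allowed by the loose specification checked in add_meets_integer_spec.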


integer 


64-bit-int  regular_int 


32-bit-int  16-bit-int 


Figure  14:  Integer  Family 


7  Related  Work 

Some  of  the  research  on  defining  subtype  relations  is  concerned  with  capturing  constraints 
on  method  signatures  via  the  contra/covariance  rules,  such  as  those  used  in  languages  like 
Trellis/Owl [33], Emerald [3], Quest [5], Eiffel [30], POOL [1], and to a limited extent Modula-3
[32].  Our  rules  place  constraints  not  just  on  the  signatures  of  an  object’s  methods,  but  also  on 
their  behavior. 

Our  work  is  most  similar  to  that  of  America  [2],  who  has  proposed  rules  for  determining 
based  on  type  specifications  whether  one  type  is  a  subtype  of  another.  (Meyer  uses  America's 
pre-  and  post-condition  rules  for  Eiffel  [30],  although  here  the  pre-  and  post-conditions  are  given 
“operationally,”  by  providing  a  program  to  check  them,  rather  than  assertionally.)  Cusack’s 
approach  [7]  also  relates  specifications;  her  rule  defines  subtyping  in  terms  of  strengthening 
state invariants. However, neither author considers the problems introduced by extra mutators
nor  the  preservation  of  history  properties.  Therefore,  they  allow  certain  subtype  relations  that 
we  forbid  (e.g.,  intset  could  be  a  subtype  of  fat_set  in  these  approaches). 

The  emphasis  on  semantics  of  abstract  types  is  a  prominent  feature  of  the  work  by  Leavens. 
In  his  Ph.D.  thesis  [21]  Leavens  defines  types  in  terms  of  algebras  and  subtyping  in  terms 
of  a  simulation  relation  between  them.  The  work  by  Bruce  and  Wegner  [4]  is  similar;  like 
Leavens,  they  base  their  work  on  algebras.  Leavens  considered  only  immutable  types.  Dhara 
[10,  11,  23]  extends  Leavens’  thesis  work  to  deal  with  mutable  types,  but  rules  out  the  cases 
where  extra  methods  cause  problems;  the  rules  are  defined  just  for  individual  programs  that 
have  no  aliasing  between  objects  of  related  types,  and  therefore  state  changes  caused  by  a 
subtype’s  extra  methods  cannot  be  observed  through  the  supertype.  Because  of  this  restriction 
on  aliasing  they  allow  some  subtype  relations  to  hold  where  we  do  not.  For  example,  they  allow 
mutable  pairs  to  be  a  subtype  of  immutable  pairs  whereas  we  do  not. 

In  addition,  these  algebraic  approaches  are  not  constructive,  i.e.,  they  tell  you  what  to  look 
for,  but  not  how  to  prove  that  you  got  it.  Utting  [36]  does  provide  a  constructive  approach, 
but  he  bases  his  work  in  the  refinement  calculus  language  [31],  a  formalism  that  we  believe  is 
not  very  easy  for  programmers  to  deal  with.  Utting  is  not  concerned  with  preserving  history 
properties  in  the  presence  of  extra  methods  and  he  also  does  not  allow  data  refinement  between 
supertype  and  subtype  value  spaces. 

Others  have  worked  on  the  specification  of  types  and  subtypes.  For  example,  many  have 
proposed Z as the basis of specifications of object types [8, 12, 6]; Goguen and Meseguer use
FOOPS [15]; Leavens and his colleagues use Larch [22, 24, 11]. Though several of these researchers
separate the specification of an object’s creators from its other methods, none has
identified  the  problem  posed  by  the  missing  creators,  and  thus  none  has  provided  an  explicit 
solution  to  this  problem. 

In  summary,  our  work  is  similar  in  spirit  to  that  of  America  and  Cusack  because  they  take 
a  specification-based  approach  to  defining  a  behavioral  notion  of  subtyping.  It  complements 
the  algebraic  model-based  approach  taken  by  Leavens,  Dhara,  and  Bruce  and  Wegner.  Only 
America,  Cusack,  Utting,  and  Dhara  deal  with  mutability,  but  none  has  addressed  the  need  to 
preserve  history  properties.  Only  we  have  a  technique  that  works  in  a  general  environment  in 
which  objects  can  be  shared  among  possibly  concurrent  users. 

8 Summary and Future Work


This  paper  defines  a  new  notion  of  the  subtype  relation  based  on  the  semantic  properties  of  the 
subtype  and  supertype.  An  object’s  type  determines  both  a  set  of  legal  values  and  an  interface 
with  its  environment  (through  calls  on  its  methods).  Thus,  we  are  interested  in  preserving 
properties  about  supertype  values  and  methods  when  designing  a  subtype.  We  require  that  a 
subtype  preserve  all  the  invariant  and  history  properties  of  its  supertype.  We  are  particularly 
interested  in  an  object’s  observable  behavior  (state  changes),  thus  motivating  our  focus  on 
history  properties  and  on  mutable  types  and  mutators. 

The  paper  presents  two  ways  of  defining  the  subtype  relation,  one  using  constraints  and  the 
other  using  the  extension  rule.  Either  of  these  approaches  guarantees  that  subtypes  preserve 
their  supertypes’  invariant  and  history  properties.  Ours  is  the  first  work  to  deal  with  history 
properties,  and  to  provide  a  way  of  determining  the  acceptability  of  the  “extra”  methods  in  the 
presence  of  mutability. 

The  paper  also  presents  a  way  to  specify  the  semantic  properties  of  types  formally.  One 
reason  we  chose  to  base  our  approach  on  Larch  is  that  Larch  allows  formal  proofs  to  be  done 
entirely  in  terms  of  specifications.  In  fact,  once  the  theorems  corresponding  to  our  subtyping 
rules are formally stated in Larch, their proofs are almost completely mechanical (a matter of
symbol manipulation) and could be done with the assistance of the Larch Prover [14].

Although we gave two formal definitions of the subtype relation, we did not formally characterize
the criterion against which we can measure the soundness of our definitions. We only
argued informally that our definitions guarantee that a subtype’s objects behave the same, e.g.,
preserve properties, as their supertype’s. A formal characterization of this criterion remains
another open research problem. One possibility is to do this within the Larch framework. In
Larch, the meaning of a specification is the theory derived from a set of axioms and rules.
A possible correctness criterion is to require the theory of a subtype to contain those of its
supertypes.

In  developing  our  definitions,  we  were  motivated  primarily  by  pragmatics.  Our  intention  is 
to  capture  the  intuition  programmers  apply  when  designing  type  hierarchies  in  object-oriented 
languages. However, intuition in the absence of precision can often go astray or lead to confusion.
This is why it has been unclear how to organize certain type hierarchies such as integers.
Our definition sheds light on such hierarchies and helps in uncovering new designs. It also supports
the kind of reasoning that is needed to ensure that programs that work correctly using
the supertype continue to work correctly with the subtype.

We  believe  that  programmers  will  find  our  approaches  relatively  easy  to  apply  and  expect 
them to be used primarily in an informal way. The essence of a subtype relationship (in either
of  our  approaches)  is  expressed  in  the  mappings.  We  hope  that  the  mappings  will  be  defined  as 
part  of  giving  type  and  subtype  specifications,  in  much  the  same  way  that  abstraction  functions 
and  representation  invariants  are  given  as  comments  in  a  program  that  implements  an  abstract 
type.  The  proofs  can  be  done  at  this  point  also;  they  are  usually  trivial  and  can  be  done  by 
inspection. 